Smart Paper Notes

MIPI 2022 Challenge on RGBW Sensor Re-mosaic: Dataset and Report

Qingyu Yang , Guang Yang , Jun Jiang , Chongyi Li , Ruicheng Feng , Shangchen Zhou , Wenxiu Sun , Qingpeng Zhu , Chen Change Loy , Jinwei Gu

Category: Computer Vision

2022-09-15

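With the growing demand for computational photography and imaging on mobile platforms, developing and integrating advanced image sensors with novel algorithms in camera systems has become prevalent. However, the lack of high-quality data for research and the rare opportunity for in-depth exchange of views between industry and academia constrain the development of mobile intelligent photography and imaging (MIPI). To bridge the gap, we introduce the first MIPI challenge, including five tracks focusing on novel image sensors and imaging algorithms. This paper introduces RGBW Joint Remosaic and Denoise, one of the five tracks, which works on the interpolation of the RGBW CFA to Bayer at full resolution. Participants were provided with a new dataset of 70 (training) and 15 (validation) scenes of high-quality RGBW and Bayer pairs. In addition, for each scene, RGBW inputs at different noise levels were provided at 0 dB, 24 dB, and 42 dB. All data were captured using an RGBW sensor in both outdoor and indoor conditions. The final results are evaluated using objective metrics including PSNR, SSIM, LPIPS, and KLD. This paper provides a detailed description of all models developed in this challenge. More details of this challenge and the link to the dataset can be found at https://github.com/mipi-challenge/mipi2022.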

translated by Google Translate
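
The MIPI challenge reports in this list name PSNR among their objective metrics. As a rough illustration of what that score measures, here is a minimal sketch. It assumes pixel values normalized to [0, 1] and flattened into plain Python sequences. The function name `psnr` and the `max_val` parameter are illustrative and are not taken from the challenge's evaluation code.

```python
import math


def psnr(reference, estimate, max_val=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    Assumption: pixels are floats in [0, max_val]; images are flattened.
    """
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_val ** 2 / mse)


if __name__ == "__main__":
    clean = [0.0, 0.5, 1.0, 0.25]
    noisy = [0.05, 0.45, 0.95, 0.30]
    print(f"PSNR: {psnr(clean, noisy):.2f} dB")
```

Higher values mean the estimate is closer to the reference. A constant error of 0.1 on a [0, 1] scale corresponds to 20 dB.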

MIPI 2022 Challenge on RGBW Sensor Fusion: Dataset and Report

Qingyu Yang , Guang Yang , Jun Jiang , Chongyi Li , Ruicheng Feng , Shangchen Zhou , Wenxiu Sun , Qingpeng Zhu , Chen Change Loy , Jinwei Gu

Category: Computer Vision

2022-09-15

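With the growing demand for computational photography and imaging on mobile platforms, developing and integrating advanced image sensors with novel algorithms in camera systems has become prevalent. However, the lack of high-quality data for research and the rare opportunity for in-depth exchange of views between industry and academia constrain the development of mobile intelligent photography and imaging (MIPI). To bridge the gap, we introduce the first MIPI challenge, including five tracks focusing on novel image sensors and imaging algorithms. This paper introduces RGBW Joint Fusion and Denoise, one of the five tracks, which works on the fusion of binning-mode RGBW to Bayer. Participants were provided with a new dataset of 70 (training) and 15 (validation) scenes of high-quality RGBW and Bayer pairs. In addition, for each scene, RGBW inputs at different noise levels were provided at 24 dB and 42 dB. All data were captured using an RGBW sensor in both outdoor and indoor conditions. The final results are evaluated using objective metrics including PSNR, SSIM, LPIPS, and KLD. This paper provides a detailed description of all models developed in this challenge. More details of this challenge and the link to the dataset can be found at https://github.com/mipi-challenge/mipi2022.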

MIPI 2022 Challenge on Quad-Bayer Re-mosaic: Dataset and Report

Qingyu Yang , Guang Yang , Jun Jiang , Chongyi Li , Ruicheng Feng , Shangchen Zhou , Wenxiu Sun , Qingpeng Zhu , Chen Change Loy , Jinwei Gu

Category: Computer Vision

2022-09-15

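With the growing demand for computational photography and imaging on mobile platforms, developing and integrating advanced image sensors with novel algorithms in camera systems has become prevalent. However, the lack of high-quality data for research and the rare opportunity for in-depth exchange of views between industry and academia constrain the development of mobile intelligent photography and imaging (MIPI). To bridge the gap, we introduce the first MIPI challenge, including five tracks focusing on novel image sensors and imaging algorithms. This paper introduces Quad Remosaic and Denoise, one of the five tracks, which works on the interpolation of the Quad CFA to Bayer at full resolution. Participants were provided with a new dataset of 70 (training) and 15 (validation) scenes of high-quality Quad and Bayer pairs. In addition, for each scene, Quad inputs at different noise levels were provided at 0 dB, 24 dB, and 42 dB. All data were captured using a Quad sensor in both outdoor and indoor conditions. The final results are evaluated using objective metrics including PSNR, SSIM, LPIPS, and KLD. This paper provides a detailed description of all models developed in this challenge. More details of this challenge and the link to the dataset can be found at https://github.com/mipi-challenge/mipi2022.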

MIPI 2022 Challenge on RGB+ToF Depth Completion: Dataset and Report

Wenxiu Sun , Qingpeng Zhu , Chongyi Li , Ruicheng Feng , Shangchen Zhou , Jun Jiang , Qingyu Yang , Chen Change Loy , Jinwei Gu

Category: Computer Vision

2022-09-15

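With the growing demand for computational photography and imaging on mobile platforms, developing and integrating advanced image sensors with novel algorithms in camera systems has become prevalent. However, the lack of high-quality data for research and the rare opportunity for in-depth exchange of views between industry and academia constrain the development of mobile intelligent photography and imaging (MIPI). To bridge the gap, we introduce the first MIPI challenge, including five tracks focusing on novel image sensors and imaging algorithms. This paper introduces RGB+ToF Depth Completion, one of the five tracks, which works on the fusion of an RGB sensor and a ToF sensor (with spot illumination). Participants were provided with a new dataset called TetrasRGBD, which contains 18k pairs of high-quality synthetic RGB+Depth training data and 2.3k pairs of testing data from mixed sources. All data were collected in indoor scenes. We require the running time of all methods to be real-time on desktop GPUs. The final results are evaluated using objective metrics and, subjectively, the Mean Opinion Score (MOS). This paper provides a detailed description of all models developed in this challenge. More details of this challenge and the link to the dataset can be found at https://github.com/mipi-challenge/mipi2022.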

MIPI 2022 Challenge on Under-Display Camera Image Restoration: Methods and Results

Ruicheng Feng , Chongyi Li , Shangchen Zhou , Wenxiu Sun , Qingpeng Zhu , Jun Jiang , Qingyu Yang , Chen Change Loy , Jinwei Gu

Category: Computer Vision

2022-09-15

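With the growing demand for computational photography and imaging on mobile platforms, developing and integrating advanced image sensors with novel algorithms in camera systems has become prevalent. However, the lack of high-quality data for research and the rare opportunity for in-depth exchange of views between industry and academia constrain the development of mobile intelligent photography and imaging (MIPI). To bridge the gap, we introduce the first MIPI challenge, including five tracks focusing on novel image sensors and imaging algorithms. In this paper, we summarize and review the Under-Display Camera (UDC) Image Restoration track of MIPI 2022. In total, 167 participants registered successfully, and 19 teams submitted results in the final testing phase. The solutions developed in this challenge achieve state-of-the-art performance on Under-Display Camera image restoration. This paper provides a detailed description of all models developed in this challenge. More details of this challenge and the link to the dataset can be found at https://github.com/mipi-challenge/mipi2022.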

CuDi: Curve Distillation for Efficient and Controllable Exposure Adjustment

Chongyi Li , Chunle Guo , Ruicheng Feng , Shangchen Zhou , Chen Change Loy

Category: Computer Vision

2022-07-28

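We present Curve Distillation, CuDi, for efficient and controllable exposure adjustment without requiring paired or unpaired data during training. Our method inherits the zero-reference learning and curve-based framework from an effective low-light image enhancement method, Zero-DCE, while further speeding up its inference, reducing its model size, and extending it to controllable exposure adjustment. The faster inference and lightweight model are achieved through a novel curve distillation that approximates the time-consuming iterative operation of the conventional curve-based framework with the tangent line of a high-order curve. Controllable exposure adjustment is made possible by a new self-supervised spatial exposure control loss, which constrains the exposure levels of different spatial regions of the output to be close to the brightness distribution of an exposure map that serves as an input condition. Unlike most existing methods, which can correct only underexposed or only overexposed photos, our approach corrects both underexposed and overexposed photos with a single model. Notably, our approach can also adjust the exposure level of a photo globally or locally under the guidance of the input condition exposure map, which can be predefined or set manually at the inference stage. Through extensive experiments, we show that our method compares favorably with state-of-the-art methods in real scenes thanks to its fast, robust, and flexible performance. Project page: https://li-chongyi.github.io/cudi_files/.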
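
For context on the "iterative operation" that the abstract says curve distillation approximates, the sketch below applies the quadratic light-enhancement curve used by Zero-DCE, LE(x) = x + α·x·(1 − x), repeatedly to a single normalized pixel value. This is only an illustration of the conventional curve-based framework, not of CuDi's distillation step. Scalar `alphas` are a simplifying assumption (Zero-DCE predicts per-pixel parameter maps), and the function name is hypothetical.

```python
def apply_curve(x, alphas):
    """Iteratively apply LE(x) = x + a * x * (1 - x), once per entry in alphas.

    Assumptions: x is a pixel intensity in [0, 1]; each a lies in [-1, 1],
    which keeps the result inside [0, 1].
    """
    for a in alphas:
        x = x + a * x * (1.0 - x)
    return x


if __name__ == "__main__":
    pixel = 0.5
    # Positive alphas brighten, negative alphas darken.
    print(apply_curve(pixel, [1.0]))        # one brightening step
    print(apply_curve(pixel, [1.0, 1.0]))   # two brightening steps
    print(apply_curve(pixel, [-1.0]))       # one darkening step
```

Each extra iteration raises the order of the overall curve, which is what makes the loop costly at inference time compared with a single-pass approximation.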

Effective and Efficient Training for Sequential Recommendation Using Cumulative Cross-Entropy Loss

Fangyu Li , Shenbao Yu , Feng Zeng , Fang Yang

Category: Machine Learning

2023-01-03

Research interest in sequential recommender systems, which aim to model dynamic sequence representations precisely, is growing. However, the loss functions most commonly used in state-of-the-art sequential recommendation models have essential limitations. To name a few, Bayesian Personalized Ranking (BPR) loss suffers from the vanishing gradient problem caused by extensive negative sampling and from prediction biases; Binary Cross-Entropy (BCE) loss depends on the number of negative samples, so it is likely to ignore valuable negative examples and reduce training efficiency; Cross-Entropy (CE) loss focuses only on the last timestamp of the training sequence, which causes low utilization of sequence information and results in inferior user sequence representations. To avoid these limitations, in this paper we propose to calculate Cumulative Cross-Entropy (CCE) loss over the sequence. CCE is simple and direct, and it enjoys the virtues of painless deployment, no negative sampling, and effective and efficient training. We conduct extensive experiments on five benchmark datasets to demonstrate the effectiveness and efficiency of CCE. The results show that employing CCE loss on three state-of-the-art models, GRU4Rec, SASRec, and S3-Rec, yields average improvements of 125.63%, 69.90%, and 33.24% in full-ranking NDCG@5, respectively. With CCE, the models' performance curves on the test data rise rapidly with wall-clock time and stay above those of other loss functions throughout almost the whole training process.
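
The abstract contrasts CE, which scores only the last timestamp, with CCE, which is computed over the whole sequence. The sketch below illustrates that difference under stated assumptions: each position has a list of logits over a hypothetical item catalog and a target index for the next item, and the per-position losses are combined by an unweighted sum (the paper's exact normalization is not stated in the abstract). All function names are illustrative.

```python
import math


def cross_entropy(logits, target):
    """Softmax cross-entropy for one position: -log softmax(logits)[target]."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_z - logits[target]


def last_step_ce(seq_logits, seq_targets):
    """CE as described in the abstract: only the last timestamp contributes."""
    return cross_entropy(seq_logits[-1], seq_targets[-1])


def cumulative_ce(seq_logits, seq_targets):
    """Cumulative CE (assumed unweighted sum): every timestamp contributes."""
    return sum(cross_entropy(l, t) for l, t in zip(seq_logits, seq_targets))


if __name__ == "__main__":
    # Three positions, catalog of four items (toy numbers).
    logits = [[2.0, 0.1, 0.1, 0.1], [0.1, 1.5, 0.1, 0.1], [0.1, 0.1, 0.1, 3.0]]
    targets = [0, 1, 3]
    print("last-step CE:", last_step_ce(logits, targets))
    print("cumulative CE:", cumulative_ce(logits, targets))
```

Because every position supplies a training signal, the cumulative form uses the full sequence without any negative sampling, which matches the motivation given in the abstract.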

Generalizable Black-Box Adversarial Attack with Meta Learning

Fei Yin , Yong Zhang , Baoyuan Wu , Yan Feng , Jingyi Zhang , Yanbo Fan , Yujiu Yang

Category: Machine Learning | Computer Vision

2023-01-01

In the black-box adversarial attack scenario, the target model's parameters are unknown, and the attacker aims to find a successful adversarial perturbation based on query feedback under a query budget. Because the feedback information is limited, existing query-based black-box attack methods often require many queries to attack each benign example. To reduce query cost, we propose to utilize the feedback information across historical attacks, dubbed example-level adversarial transferability. Specifically, by treating the attack on each benign example as one task, we develop a meta-learning framework that trains a meta-generator to produce perturbations conditioned on benign examples. When attacking a new benign example, the meta-generator can be quickly fine-tuned based on the feedback information of the new task as well as a few historical attacks to produce effective perturbations. Moreover, since the meta-training procedure consumes many queries to learn a generalizable generator, we utilize model-level adversarial transferability to train the meta-generator on a white-box surrogate model, then transfer it to help attack the target model. The proposed framework, with its two types of adversarial transferability, can be naturally combined with any off-the-shelf query-based attack method to boost its performance, which is verified by extensive experiments.

Spatiotemporal implicit neural representation for unsupervised dynamic MRI reconstruction

Jie Feng , Ruimin Feng , Qing Wu , Zhiyong Zhang , Yuyao Zhang , Hongjiang Wei

Category: Computer Vision

2022-12-31

Supervised Deep-Learning (DL)-based reconstruction algorithms have shown state-of-the-art results for highly undersampled dynamic Magnetic Resonance Imaging (MRI) reconstruction. However, their need for large amounts of high-quality ground-truth data hinders their application because of the generalization problem. Recently, Implicit Neural Representation (INR) has emerged as a powerful DL-based tool for solving inverse problems by characterizing the attributes of a signal as a continuous function of the corresponding coordinates in an unsupervised manner. In this work, we propose an INR-based method to improve dynamic MRI reconstruction from highly undersampled k-space data, which takes only spatiotemporal coordinates as inputs. Specifically, the proposed INR represents the dynamic MRI images as an implicit function and encodes them into neural networks. The weights of the network are learned only from the sparsely acquired (k, t)-space data itself, without external training datasets or prior images. Benefiting from the strong implicit continuity regularization of INR together with explicit regularization for low-rankness and sparsity, our proposed method outperforms the compared scan-specific methods at various acceleration factors. For example, experiments on retrospective cardiac cine datasets show an improvement of 5.5 to 7.1 dB in PSNR for extremely high accelerations (up to 41.6-fold). The high quality and inner continuity of the images provided by INR have great potential to further improve the spatiotemporal resolution of dynamic MRI, without the need for any training data.

Memory Augmented Lookup Dictionary based Language Modeling for Automatic Speech Recognition

Yukun Feng , Ming Tu , Rui Xia , Chuanzeng Huang , Yuxuan Wang

Category: Natural Language Processing

2022-12-30

Recent studies have shown that using an external Language Model (LM) benefits end-to-end Automatic Speech Recognition (ASR). However, predicting tokens that appear less frequently in the training set is still quite challenging. Long-tail prediction problems have been widely studied in many applications, but have been addressed by only a few studies for ASR and LMs. In this paper, we propose a new memory-augmented, lookup-dictionary-based Transformer architecture for LMs. The newly introduced lookup dictionary incorporates rich contextual information from the training set, which is vital for correctly predicting long-tail tokens. In intensive experiments on Chinese and English datasets, our proposed method is shown to outperform the baseline Transformer LM by a large margin on both word/character error rate and tail-token error rate. This is achieved without affecting decoding efficiency. Overall, we demonstrate the effectiveness of our proposed method in boosting ASR decoding performance, especially for long-tail tokens.
